This first part of the exercises deals only with importing data. In the second exercise, we will turn to non-flat data files and label the data before exporting.

1

Load the titanic data.
The file format is CSV. Unless you prefer base R, you may want to use the readr package and one of its functions that start with read_...
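As a minimal sketch of the readr approach, the following writes a tiny made-up passenger table to a temporary CSV file and reads it back with read_csv(). The toy data and the temporary file are assumptions for illustration only, not the course's titanic file; point read_csv() at wherever your copy of the data lives.

```r
library(readr)

# Hypothetical toy data standing in for the titanic file
toy <- data.frame(
  name = c("Allen", "Braund", "Cumings"),
  sex  = c("female", "male", "female"),
  age  = c(29, 22, 38)
)

# Write to a temporary CSV so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write_csv(toy, csv_path)

# read_csv() returns a tibble; text columns arrive as character
passengers <- read_csv(csv_path, show_col_types = FALSE)
class(passengers$sex)
```

The show_col_types = FALSE argument only silences the column specification message; leave it out if you want to see how readr guessed each column.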

You may have noticed that the function you just used imports factor variables as characters by default. For some analyses, this may not be what we want (for example, if we’re going to use “sex” as a predictor in a regression).

2

Convert the variable sex to a factor.
You can do that while importing the data or after loading it.
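Both routes can be sketched on a small stand-in file. The two-row CSV below is made up for illustration (it is not the titanic data), and the object names at_import and after_import are hypothetical.

```r
library(readr)

# Hypothetical two-row CSV written to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,sex", "Allen,female", "Braund,male"), csv_path)

# Option 1: declare the column type while importing
at_import <- read_csv(csv_path, col_types = cols(sex = col_factor()))

# Option 2: import first, then convert the character column
after_import <- read_csv(csv_path, show_col_types = FALSE)
after_import$sex <- factor(after_import$sex)

levels(at_import$sex)
levels(after_import$sex)
```

One difference worth keeping in mind: without explicit levels, col_factor() orders the levels by first appearance in the file, whereas factor() sorts them. In this toy file both orders happen to coincide.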

We have already worked with the Titanic data quite a bit. Let’s import some other data for a change.

3

Load the gapminder GDP data. Use the .xlsx file in the data folder.
We need the readxl package for this importing task.
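A minimal sketch of the readxl workflow follows. Because the location of the gapminder file depends on your setup, the example uses datasets.xlsx, a workbook that ships with readxl, as a stand-in; swap in the path to the .xlsx file in your data folder.

```r
library(readxl)

# Example workbook bundled with readxl (a stand-in, not the gapminder file)
xlsx_path <- readxl_example("datasets.xlsx")

# List the worksheets contained in the workbook
excel_sheets(xlsx_path)

# read_excel() reads the first sheet by default and returns a tibble
first_sheet <- read_excel(xlsx_path)
class(first_sheet)
```

If the sheet you need is not the first one, read_excel() also accepts a sheet argument that takes a sheet name or position.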

As you may have noticed, the format of the output of the two importing functions is the same (tibbles in both cases). Sometimes, however, the contents of an Excel file are not that easy to import. We will illustrate this with the help of the Unicorns on Unicycles dataset. This is what is known about this data according to its creator:

The documents were recently unearthed from a hidden chest in Delft and seem to be written by Rudolphus Hogervorstus, my great great great uncle, in 1681. These documents show that he was a scientist studying the then roaming herds of unicorns in the area around Delft. Unfortunately these animals are extinct now. His work contains multiple tables, carefully written down, documenting the population of unicorns over time in multiple places and related to that the sales and numbers of unicycles in those countries. According to Rudolphus the unicorn populations and unicycles are related “The presence of the cone on the unicorn hints at a very defined sense of equilibrium, it is therefore only natural to assume unicorns ride unicycles”. As part of the archival process these tables were copied, as Rudolphus himself would say: “with the black magic, so vile it could not be discussed for hell would come descent upon us” into satans own spawn: Microsoft Excel.

Source: https://github.com/RMHogervorst/unicorns_on_unicycles

4

Load the unicorn sales data file. As we are not interested in the total_turnover variable, read in only the cell range A1:C43.
You can define ranges with the argument range = range_definition (see ?read_excel).
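To illustrate how the range argument behaves, here is a small sketch that applies the same cell range to the datasets.xlsx workbook bundled with readxl. That workbook is only a stand-in for the unicorn sales file, whose path depends on your setup.

```r
library(readxl)

# Bundled example workbook used as a stand-in for the unicorn file
xlsx_path <- readxl_example("datasets.xlsx")

# Read only columns A to C and rows 1 to 43 of the first sheet;
# row 1 supplies the column names, so 42 data rows remain
subset_tbl <- read_excel(xlsx_path, range = "A1:C43")
dim(subset_tbl)
```

Everything outside the rectangle is ignored, which is why a fourth column such as total_turnover would simply never be read.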